Filling the Gap: Semi-Supervised Learning for Opinion Detection Across Domains
Abstract
We investigate the use of Semi-Supervised Learning (SSL) in opinion detection both in sparse data situations and for domain adaptation. We show that co-training reaches the best results in an in-domain setting with small labeled data sets, with a maximum absolute gain of 33.5%. For domain transfer, we show that self-training gains an absolute improvement in labeling accuracy for blog data of 16% over the supervised approach with target domain training data.
Related resources
Semi-supervised Subspace Co-Projection for Multi-class Heterogeneous Domain Adaptation
Heterogeneous domain adaptation aims to exploit labeled training data from a source domain for learning prediction models in a target domain under the condition that the two domains have different input feature representation spaces. In this paper, we propose a novel semi-supervised subspace co-projection method to address multiclass heterogeneous domain adaptation. The proposed method projects...
Predictive Features in Semi-Supervised Learning for Polarity Classification and the Role of Adjectives
In opinion mining, there has been only very little work investigating semi-supervised machine learning on document-level polarity classification. We show that semi-supervised learning performs significantly better than supervised learning when only few labeled data are available. Semi-supervised polarity classifiers rely on a predictive feature set. (Semi-)Manually built polarity lexicons are o...
Incremental Learning on Sentiment Analysis Using Weakly Supervised Learning Techniques
Due to the advanced technologies of Web 2.0, people are participating in and exchanging opinions through social media sites such as Web forums and Weblogs. Classification and analysis of such opinions and sentiment information is potentially important for service and product providers as well as users, because this analysis is used for making valuable decisions. Sentiment is expressed differentl...
MEFUASN: A Helpful Method to Extract Features using Analyzing Social Network for Fraud Detection
Fraud detection is one of the ways to cope with damages associated with fraudulent activities that have become common due to the rapid development of the Internet and electronic business. There is a need to propose methods to detect fraud accurately and quickly. To achieve accuracy, fraud detection methods need to consider both kinds of features, features based on user level and features based o...
Unsupervised Pre-training Across Image Domains Improves Lung Tissue Classification
The detection and classification of anomalies relevant for disease diagnosis or treatment monitoring is important during computational medical image analysis. Often, obtaining sufficient annotated training data to represent natural variability well is unfeasible. At the same time, data is frequently collected across multiple sites with heterogeneous medical imaging equipment. In this paper we p...